Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport to 2.17.x: #7566: Improve transaction check in refresh_cagg #7593

Closed
wants to merge 1 commit into from

Conversation

timescale-automation
Copy link

This is an automated backport of #7566: Improve transaction check in refresh_cagg.
The original issue is #6533.
This PR will be merged automatically after all the relevant CI checks pass. If this fix should not be backported, or will be backported manually, just close this PR. You can use the backport branch to add your changes, it won't be modified automatically anymore.

For more details, please see the documentation

Original description

Improve transaction check in refresh_cagg

Intro: Hi, I was investigating an issue with portal snapshots (0) did not account for all active snapshots (1) error inside another Postgres extension (wdb-97, unrelated to Timescale) and stumbled upon the #6533 issue in Timescale that was reporting the same error. Decided to have a deeper look into it.

This PR allows better error messages to be reported from refresh_continuous_aggregate when it is called from an atomic (no transaction allowed) context. One of the following messages:

  • ERROR: refresh_continuous_aggregate() cannot run inside a transaction block
  • ERROR: refresh_continuous_aggregate() cannot be executed from a function

is reported now instead of: ERROR: portal snapshots (N) did not account for all active snapshots (N+1). There are no other changes to refresh_continuous_aggregate logic.

Longer description, also included in process_utility.h:

Procedures that use multiple transactions cannot be run in a transaction block (from a function, from dynamic SQL) or in a subtransaction (from a procedure block with an EXCEPTION clause). Such procedures use PreventInTransactionBlock function to check whether they can be run.

Though currently such checks are incomplete, because PreventInTransactionBlock requires isTopLevel argument to throw a consistent error when the call originates from a function. This isTopLevel flag (that is a bit poorly named - see below) is not readily available inside C procedures. The source of truth for it - ProcessUtilityContext parameter is passed to ProcessUtility hooks, but is not included with the function calls. There is an undocumented SPI_inside_nonatomic_context function, that would have been sufficient for isTopLevel flag, but it currently returns false when SPI connection is absent (that is a valid scenario when C procedures are called from top-level SQL instead of PLPG procedures or DO blocks) so it cannot be used.

To work around this the value of ProcessUtilityContext parameter is saved when TS ProcessUtility hook is entered and can be accessed from C procedures using new ts_process_utility_is_context_nonatomic function. The result is called "non-atomic" instead of "top-level" because the way how isTopLevel flag is determined from the ProcessUtilityContext value in standard_ProcessUtility is insufficient for C procedures - it excludes PROCESS_UTILITY_QUERY_NONATOMIC value (used when called from PLPG procedure without an EXCEPTION clause) that is a valid use case for C procedures with transactions. See details in the description of ExecuteCallStmt function.

It is expected that calls to C procedures are done with CALL and always pass though the ProcessUtility hook. The ProcessUtilityContext parameter is set to PROCESS_UTILITY_TOPLEVEL value by default. In unlikely case when a C procedure is called without passing through ProcessUtility hook and the call is done in atomic context, then PreventInTransactionBlock checks will pass, but SPI_commit will fail when checking that all current active snapshots are portal-owned snapshots (the same behaviour that was observed before this change). In atomic context there will be an additional snapshot set in _SPI_execute_plan, see the snapshot handling invariants description in that function.

With initial version of this PR, in TS ProcessUtility hook the saved ProcessUtilityContext value is reset back to PROCESS_UTILITY_TOPLEVEL on normal exit but is NOT reset in case of ereport exit. C procedures can call ts_process_utility_context_reset function to reset the saved value before doing the checks that can result in ereport exit. The scenario when more thorough reset may be necessary - when subsequent calls after the failed atomic call are not passed through the ProcessUtility hook - seems to be unlikely.

Closes #6533.

Disable-check: commit-count

Procedures that use multiple transactions cannot be run in a transaction
block (from a function, from dynamic SQL) or in a subtransaction (from a
procedure block with an EXCEPTION clause). Such procedures use
PreventInTransactionBlock function to check whether they can be run.

Though currently such checks are incompete, because
PreventInTransactionBlock requires isTopLevel argument to throw a
consistent error when the call originates from a function. This
isTopLevel flag (that is a bit poorly named - see below) is not readily
available inside C procedures. The source of truth for it -
ProcessUtilityContext parameter is passed to ProcessUtility hooks, but
is not included with the function calls. There is an undocumented
SPI_inside_nonatomic_context function, that would have been sufficient
for isTopLevel flag, but it currently returns false when SPI connection
is absent (that is a valid scenario when C procedures are called from
top-lelev SQL instead of PLPG procedures or DO blocks) so it cannot be
used.

To work around this the value of ProcessUtilityContext parameter is
saved when TS ProcessUtility hook is entered and can be accessed from
C procedures using new ts_process_utility_is_context_nonatomic function.
The result is called "non-atomic" instead of "top-level" because the way
how isTopLevel flag is determined from the ProcessUtilityContext value
in standard_ProcessUtility is insufficient for C procedures - it
excludes PROCESS_UTILITY_QUERY_NONATOMIC value (used when called from
PLPG procedure without an EXCEPTION clause) that is a valid use case for
C procedures with transactions. See details in the description of
ExecuteCallStmt function.

It is expected that calls to C procedures are done with CALL and always
pass though the ProcessUtility hook. The ProcessUtilityContext
parameter is set to PROCESS_UTILITY_TOPLEVEL value by default. In
unlikely case when a C procedure is called without passing through
ProcessUtility hook and the call is done in atomic context, then
PreventInTransactionBlock checks will pass, but SPI_commit will fail
when checking that all current active snapshots are portal-owned
snapshots (the same behaviour that was observed before this change).
In atomic context there will be an additional snapshot set in
_SPI_execute_plan, see the snapshot handling invariants description
in that function.

Closes #6533.

(cherry picked from commit 85437c0)
@timescale-automation timescale-automation added the is-auto-backport PR created by backport automation label Jan 15, 2025
@timescale-automation timescale-automation enabled auto-merge (rebase) January 15, 2025 15:38
@fabriziomello
Copy link
Contributor

This autobackport is not needed

auto-merge was automatically disabled January 15, 2025 16:50

Pull request was closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
is-auto-backport PR created by backport automation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants